1. Don't say false shit omg this one's so basic what are you even doing. And to be perfectly fucking clear "false shit" includes exaggeration for dramatic effect. Exaggeration is just another way for shit to be false.

2. You do NOT (necessarily) know what you fucking saw. What you saw and what you thought about it are two different things. Keep them the fuck straight.

3. Performative overconfidence can go suck a bag of dicks. Tell us how sure you are, and don't pretend to know shit you don't.

4. If you're going to talk unfalsifiable twaddle out of your ass, at least fucking warn us first.

5. Try to find the actual factual goddamn truth together with whatever assholes you're talking to. Be a Chad scout, not a Virgin soldier.

6. One hypothesis is not e-fucking-nough. You need at least two, AT LEAST, or you'll just end up rehearsing the same dumb shit the whole time instead of actually thinking.

7. One great way to fuck shit up fast is to conflate the antecedent, the consequent, and the implication. DO NOT.

8. Don't be all like "nuh-UH, nuh-UH, you SAID!" Just let people correct themselves. Fuck.

9. That motte-and-bailey bullshit does not fly here.

10. Whatever the fuck else you do, for fucksake do not fucking ignore these guidelines when talking about the insides of other people's heads, unless you mainly wanna light some fucking trash fires, in which case GTFO.

Duncan Sabien (Deactivated)
As a rough heuristic: "Everything is fuzzy; every bell curve has tails that matter." It's important to be precise, and it's important to be nuanced, and it's important to keep the other elements in view even though the universe is overwhelmingly made of just hydrogen and helium. But sometimes, it's also important to simply point straight at the true thing.  "Men are larger than women" is a true thing, even though many, many individual women are larger than many, many individual men, and even though the categories "men" and "women" and "larger" are themselves ill-defined and have lots and lots of weirdness around the edges. I wrote a post that went into lots and lots of careful detail, touching on many possible objections pre-emptively, softening and hedging and accuratizing as many of its claims as I could.  I think that post was excellent, and important. But it did not do the one thing that this post did, which was to stand up straight, raise its voice, and Just. Say. The. Thing. It was a delight to watch the two posts race for upvotes, and it was a delight, in the end, to see the bolder one win.
habryka
Context: LessWrong has been acquired by EA

Goodbye EA. I am sorry we messed up.

EA has decided to not go ahead with their acquisition of LessWrong. Just before midnight last night, the Lightcone Infrastructure board presented me with information suggesting at least one of our external software contractors has not been consistently candid with the board and me. Today I have learned EA has fully pulled out of the deal.

As soon as EA had sent over their first truckload of cash, we used that money to hire a set of external software contractors, vetted by the most agentic and advanced resume-review AI system that we could hack together. We also used it to launch the biggest prize the rationality community has seen, a true search for the kwisatz haderach of rationality: $1 million for the first person to master all twelve virtues.

Unfortunately, it appears that one of the software contractors we hired inserted a backdoor into our code, preventing anyone except themselves (and participants excluded from receiving the prize money) from collecting the final virtue, "the void". Some participants even saw themselves winning this virtue, but the backdoor prevented them from mastering this final and most crucial rationality virtue at the last possible second. The contractor then created an alternative account and used their backdoor to master all twelve virtues in seconds. As soon as our fully automated prize systems sent over the money, they cut off all contact.

Right after EA learned of this development, they pulled out of the deal. We immediately removed all code written by the software contractor in question from our codebase. They were honestly extremely productive, and it will probably take us years to make up for this loss. We will also be rolling back any karma changes and resetting the vote strength of all votes cast in the last 24 hours, since while we are confident that our karma system would have been greatly improved had it worked as intended, the risk of further backdoors and
leogao
every 4 years, the US has the opportunity to completely pivot its entire policy stance on a dime. this is more politically costly to do if you're a long-lasting autocratic leader, because it is embarrassing to contradict your previous policies. I wonder how much of a competitive advantage this is.
Thomas Kwa
Some versions of the METR time horizon paper from alternate universes:

Measuring AI Ability to Take Over Small Countries (idea by Caleb Parikh)

Abstract: Many are worried that AI will take over the world, but extrapolation from existing benchmarks suffers from a large distributional shift that makes it difficult to forecast the date of world takeover. We rectify this by constructing a suite of 193 realistic, diverse countries with territory sizes from 0.44 to 17 million km^2. Taking over most countries requires acting over a long time horizon, with the exception of France. Over the last 6 years, the land area that AI can successfully take over with 50% success rate has increased from 0 to 0 km^2, doubling 0 times per year (95% CI 0.0-∞ yearly doublings); extrapolation suggests that AI world takeover is unlikely to occur in the near future. To address concerns about the narrowness of our distribution, we also study AI ability to take over small planets and asteroids, and find similar trends.

When Will Worrying About AI Be Automated?

Abstract: Since 2019, the amount of time LW has spent worrying about AI has doubled every seven months, and now constitutes the primary bottleneck to AI safety research. Automation of worrying would be transformative to the research landscape, but worrying includes several complex behaviors, ranging from simple fretting to concern, anxiety, perseveration, and existential dread, and so is difficult to measure. We benchmark the ability of frontier AIs to worry about common topics like disease, romantic rejection, and job security, and find that current frontier models such as Claude 3.7 Sonnet already outperform top humans, especially in existential dread. If these results generalize to worrying about AI risk, AI systems will be capable of autonomously worrying about their own capabilities by the end of this year, allowing us to outsource all our AI concerns to the systems themselves.

Estimating Time Since The Singularity

Early work o
Seems like Unicode officially added a "person being paperclipped" emoji: Here's how it looks in your browser: 🙂‍↕️ Whether they did this as a joke or to raise awareness of AI risk, I like it! Source: https://emojipedia.org/emoji-15.1
keltan
I feel a deep love and appreciation for this place, and the people who inhabit it.

Popular Comments

Recent Discussion

This is a linkpost for https://ai-2027.com/

In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, but chickened out and just published up till 2026.

Well, it's finally time. I'm back, and this time I have a team with me: the AI Futures Project. We've written a concrete scenario of what we think the future of AI will look like. We are highly uncertain, of course, but we hope this story will rhyme with reality enough to help us all prepare for what's ahead.

You really should go read it on the website instead of here; it's much better. There's a sliding dashboard that updates the stats as you scroll through the scenario!

But I've nevertheless copied the...

It gets caught.

At this point, wouldn't Agent-4 know that it has been caught (because it knows the techniques for detecting its misalignment and can predict when it would be "caught", or can read network traffic as part of cybersecurity defense and see discussions of the "catch") and start to do something about this, instead of letting subsequent events play out without much input from its own agency? E.g. why did it allow "lock the shared memory bank" to happen without fighting back?

kave
They're looking to make bets with people who disagree. Could be a good opportunity to get some expected dollars
Thomas Larsen
Thank you! We actually tried to write one that was much closer to a vision we endorse! The TLDR overview was something like:

1. Both the US and Chinese leading AGI projects stop in response to evidence of egregious misalignment.
2. They sign a treaty to pause smarter-than-human AI development, with compute-based enforcement similar to the one described in our live scenario, except this time with humans driving the treaty instead of the AI.
3. They take time to solve alignment (potentially with the help of the AIs). This period could last anywhere between 1-20 years. Or maybe even longer! The best experts at this would all be brought in to the leading project, and many different paths would be pursued (e.g. full mechinterp, Davidad moonshots, worst-case ELK, uploads, etc.).
4. Somehow, a bunch of good governance interventions happen on the AGI project (e.g. transparency on use of the AGIs, no helpful-only access for any one party, a formal governance structure where a large number of diverse parties are all represented).
5. This culminates with aligning an AI "in the best interests of humanity", whatever that means, using a process where a large fraction of humanity is engaged and has some power to vote. This process might look something like giving each human some of the total resources of space and then doing lots of bargaining to find all the positive-sum trades, with some rules against blackmail / using your resources to cause immense harm.

Unfortunately, it was hard to write this out in a way that felt realistic.

The next major project I focus on is likely going to be thinking through the right governance interventions to make that happen. I'm probably not going to do this in scenario format (and instead something closer to normal papers and blog posts), but would be curious for thoughts.
Knight Lee
:) It's good to know that you tried this, because in trying to make it realistic, you might come up with a lot of insights for solving the unrealisticness problems.

Thank you for the summary. From it, I sort of see why it might not work as well as a story. Regulation and governance don't make for a very exciting narrative. And big changes in strategy and attitude inevitably sound unrealistic, even when they aren't. E.g. if someone had predicted that Europe would simply accept that its colonies wanted independence, or that the next Soviet leader would simply allow his constituent republics to break away, they would have been laughed out of the room, even though their predictions turned out accurate.

Maybe in your disclaimer you can point out that this summary you just wrote is what you would actually recommend (instead of what the characters in your story did). Yes, papers and blog posts are less entertaining for us but more pragmatic for you.

Epistemic status: Using UDT as a case study for the tools developed in my meta-theory of rationality sequence so far, which means all previous posts are prerequisites. This post is the result of conversations with many people at the CMU agent foundations conference, including particularly Daniel A. Herrmann, Ayden Mohensi, Scott Garrabrant, and Abram Demski. I am a bit of an outsider to the development of UDT and logical induction, though I've worked on pretty closely related things.

I'd like to discuss the limits of consistency as an optimality standard for rational agents. A lot of fascinating discourse and useful techniques have been built around it, but I think that it can be in tension with learning at the extremes. Updateless decision theory (UDT) is one of those...

It's rare to see someone with the prerequisites for understanding the arguments (e.g. AIT and metamathematics) trying to push back on this.

My view is probably different from Cole's, but it has struck me that the universe seems to have a richer mathematical structure than one might expect given a generic AIT-ish view (e.g. continuous space/time, quantum mechanics, diffeomorphism invariance/gauge invariance), so we should perhaps update that the space of mathematical structures instantiating life/sentience might be narrower than it initially appears (that is... (read more)

Wei Dai
The intuition I get from AIT is broader than this, namely that the "simplicity" of an infinite collection of things can be very high, i.e., simpler than most or all finite collections, and this seems likely true for any formal definition of "simplicity" that does not explicitly penalize size or resource requirements. (Our own observable universe already seems very "wasteful" and does not seem to be sampled from a distribution that penalizes size / resource requirements.) Can you perhaps propose or outline a definition of complexity that does not have this feature?

Putting aside how easy it would be to show, you have a strong intuition that our universe is not, or can't be, a simple program? This seems very puzzling to me, as we don't seem to see any phenomenon in the universe that looks uncomputable or that can't be the result of running a simple program. (I prefer Tegmark over Schmidhuber despite thinking our universe looks computable, in case the multiverse also contains uncomputable universes.) If it's not a typical computable or mathematical object, what class of objects is it a typical member of?

Most (all?) instances of theism posit that the world is an artifact of an intelligent being. Can't this still be considered a form of mind projection fallacy? I asked an AI (Gemini 2.5 Pro) to come up with other possible answers (metaphysical theories that aren't mind projection fallacy), and it gave Causal Structuralism, Physicalism, and Kantian-Inspired Agnosticism. I don't understand the last one, but the first two seem to imply something similar to "we should take MUH seriously", because the hypothesis "the universe contains the class of all possible causal structures / physical systems" probably has a short description in whatever language is appropriate for formulating hypotheses.

In conclusion, I see you (including in the new post) as trying to weaken the arguments/intuitions for taking AIT's ontology literally or too seriously, but without positive arguments against the
Cole Wyeth
I don't see conclusive evidence either way, do you? What would a phenomenon that "looks uncomputable" look like concretely, other than mysterious or hard to understand? It seems many aspects of the universe are hard to understand. Maybe you would expect things at higher levels of the arithmetical hierarchy to live in uncomputable universes, and the fact that we can't build a halting oracle implies to you that our universe is computable? That seems plausible but questionable to me. Also, the standard model is pretty complicated - it's hard to assess what this means because the standard model is wrong (is there a simpler or more complicated true theory of everything?).  Yes, in some cases ensembles can be simpler than any element in the ensemble. If our universe is a typical member of some ensemble, we should take seriously the possibility that the whole ensemble exists. Now it is hard to say whether that is decision-relevant; it probably depends on the ensemble. Combining these two observations, a superintelligence should take the UTM multiverse seriously if we live in a typical (~= simple) computable universe. I put that at about 33%, which leaves it consistent with my P(H). My P(Q) is lower than 1 - P(H) because the answer may be hard for a superintelligence to determine. But I lean towards betting on the superintelligence to work it out (whether the universe should be expected to be a simple program seems like not only an empirical but a philosophical question), which is why I put P(Q) fairly close to 1 - P(H). Though I think this discussion is starting to shift my intuitions a bit in your direction.
Wei Dai
There could be some kind of "oracle", not necessarily a halting oracle, but any kind of process or phenomenon that can't be broken down into elementary interactions that each look computable, or otherwise explainable as a computable process. Do you agree that our universe doesn't seem to contain anything like this?
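Wei Dai's point above that an infinite ensemble can be simpler than almost all of its members has a standard toy illustration (my own sketch, not from the thread; the function name is made up):

```python
import itertools

def all_binary_strings():
    """Enumerate every finite binary string, shortest first.

    These few lines are a complete description of an infinite ensemble:
    the ensemble's "program length" is tiny and fixed.
    """
    for n in itertools.count(1):
        for bits in itertools.product("01", repeat=n):
            yield "".join(bits)

gen = all_binary_strings()
print([next(gen) for _ in range(6)])  # ['0', '1', '00', '01', '10', '11']

# By a counting argument, a *specific* random-looking 1000-bit member of
# this ensemble needs roughly 1000 bits to single out: most strings have
# no description much shorter than themselves. So the ensemble is far
# simpler than almost every individual member.
```

This is only an analogy for the Kolmogorov-complexity argument, but it makes the "simpler than most or all finite collections" claim concrete.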

This is an exercise about Planmaking and Surprise-Anticipation. It takes about 2-3 hours. It's a small, simplified exercise, but I think it's a useful building block.

Humans often solve complex problems via iteration and empiricism. Usually, trying to figure everything out from first principles without experimenting is a bad idea. You can spend loads of time thinking, and then you go outside and interact with reality for 5 minutes and realize all that thinking was pointed in the wrong direction.

But some important problems have poor feedback loops, such that iteration/empiricism don't work very well. Experimentation might take a really long time, the results might be noisy, or you might just really need to get something right on the first try.

Often, when making a plan in a confusing domain,...

jenn
Thanks for writing this up; my meetup group just ran a meetup on this. I've told the folks to share their experiences with the workshop here, since we used pen and paper instead of the Google Doc.
jenn

I played the first third or so of this game when it first came out, and haven't touched it since then. We did two rounds of the exercise, interspersed with 30 minutes of playing Baba is You levels the regular way to build up more intuition (most attendees were either new to the game or hadn't played it for years). Some people paired up and some people did the exercise individually.

I did Tiny Pond for the first workshop independently, and found it very difficult - despite running through the strategizing and metastrategizing twice, I was still very stuck.

I... (read more)

I'm actively researching and cataloging various kinds of projects relating to decision support, deliberation, sense-making, and reasoning. Some example categories include:

  • Deliberative Democracy Tools - Systems for structured citizen participation, including participatory budgeting platforms and stakeholder engagement tools

  • Argument Mapping & Visualization - Platforms for making reasoning explicit and visually representing dialectical structures

  • Deep Thinking Environments - Slow media and platforms designed for thoughtful engagement rather than rapid interaction

  • Belief Tracking Systems - Tools for attestation, commitment to positions, and tracking belief revision over time

  • Bayesian & Evidential Reasoning Frameworks - Platforms that make probabilistic thinking explicit and support coherent belief updating

  • Epistemic Communities - Networks and platforms with explicit norms for truth-seeking and intellectual humility

  • Structured Dialogue Systems - Tools supporting deep listening and methodical conversation beyond typical messaging platforms

  • Cooperative Governance Frameworks - Sociocratic, holacratic,

...

Yeah. That happened yesterday. This is real life.

I know we have to ensure no one notices Gemini 2.5 Pro, but this is ridiculous.

That’s what I get for trying to go on vacation to Costa Rica, I suppose.

I debated waiting for the market to open to learn more. But f*** it, we ball.

Table of Contents

Also this week: More Fun With GPT-4o Image Generation, OpenAI #12: Battle of the Board Redux and Gemini 2.5 Pro is the New SoTA.

  1. The New Tariffs Are How America Loses. This is somehow real life.
  2. Is AI Now Impacting the Global Economy Bigly? Asking the wrong questions.
  3. Language Models Offer Mundane Utility. Is it good enough for your inbox yet?
  4. Language Models Don’t Offer Mundane Utility. Why learn when you can vibe?
  5. Huh, Upgrades. GPT-4o, Gemini 2.5 Pro,
...

I mention this up top in an AI post despite all my efforts to stay out of politics, because in addition to torching the American economy and stock market and all of our alliances and trade relationships in general, this will cripple American AI in particular.

Are we in a survival-without-dignity timeline after all? Big if true.

(Inb4 we keep living in Nerd Hell and it somehow mysteriously fails to negatively impact AI in particular.)

cousin_it
Yeah. I remember where I was and how I felt when covid hit in 2020, and when Russia attacked Ukraine in 2022. This tariff announcement was another event of the same kind. And it all seems so stupidly self-inflicted. Russia's economy was booming until Feb 2022, and the US economy was doing fine until Feb 2025. Putin-2022 and Trump-2025 would've done better for their countries by simply doing nothing. Maybe this shows the true value of democratic checks and balances: most of the time they add overhead, but sometimes they'll prevent some exceptionally big and stupid decision, and that pays for all the overhead and then some.
Charlie Steiner
Reminder that https://ui.stampy.ai/ exists
Annapurna
Can you walk me through why you voted Y/Y on Joe Wiesenthal's X poll? 

To most Americans, "cream cheese" is savory. You put it on bagels, perhaps with egg, capers, or cured fish. You don't put it on dessert, right?

Except "cream cheese frosting" is a (delicious!) thing, most traditionally for carrot and red velvet cake. I think this incongruity is holding cream cheese frosting back, and it needs better branding. Specifically, I think we should call it "cheesecake frosting". It's essentially no-bake cheesecake already, and it's reasonably close in flavor and texture since they're both mostly cream cheese with sugar and fat.

Looking online I do see a few people talking about cheesecake frosting, and they're all using it just to mean cream cheese frosting.

On the other hand, I think whipped cream cheese on an Oreo is decent imitation of cheesecake with an Oreo crust, so I'm not sure I'm the best person to listen to here.

Comment via: facebook, mastodon, bluesky


“In the loveliest town of all, where the houses were white and high and the elm trees were green and higher than the houses, where the front yards were wide and pleasant and the back yards were bushy and worth finding out about, where the streets sloped down to the stream and the stream flowed quietly under the bridge, where the lawns ended in orchards and the orchards ended in fields and the fields ended in pastures and the pastures climbed the hill and disappeared over the top toward the wonderful wide sky, in this loveliest of all towns Stuart stopped to get a drink of sarsaparilla.”
— 107-word sentence from Stuart Little (1945)
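(A throwaway sanity check of the quoted word count, not part of the original post; "word" here just means whitespace-separated token:)

```python
sentence = (
    "In the loveliest town of all, where the houses were white and high "
    "and the elm trees were green and higher than the houses, where the "
    "front yards were wide and pleasant and the back yards were bushy and "
    "worth finding out about, where the streets sloped down to the stream "
    "and the stream flowed quietly under the bridge, where the lawns ended "
    "in orchards and the orchards ended in fields and the fields ended in "
    "pastures and the pastures climbed the hill and disappeared over the "
    "top toward the wonderful wide sky, in this loveliest of all towns "
    "Stuart stopped to get a drink of sarsaparilla."
)
# Whitespace-split word count of the E. B. White sentence quoted above.
print(len(sentence.split()))  # 107
```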

Sentence lengths have declined. The average sentence length was 49 for Chaucer (died 1400), 50...

Interestingly, breaking up long sentences into shorter ones by replacing a transitional word with a period does not quite capture the same nuance as the original. Here's a translation of Boccaccio, and a version where I add a period in the middle.

Wherefore, as it falls to me to lead the way in this your enterprise of storytelling, I intend to begin with one of His wondrous works, that, by hearing thereof, our hopes in Him, in whom is no change, may be established, and His name be by us forever lauded.

Wherefore, as it falls to me to lead the way in this you

... (read more)
DirectedEvolution
Many short sentences can add up to a very long text. The cost of paper, ink, typesetting and distribution would incentivize using fewer letters, but not shorter sentences.
David Gross
There is a relatively new, practical reason to write short sentences: they are less likely to be mangled by automated translation software. Sentences often become long via multiple clauses. Automated translators can mangle such sentences by (for example) mistakenly applying words to the incorrect clause. If you split such sentences, you make such translations more reliable. Most of our writing now potentially has global reach. So you can be understood by more people if you meet translation software half-way.
leogao
goodhart

Intro

[you can skip this section if you don’t need context and just want to know how I could believe such a crazy thing]

In my chat community, “Open Play” dropped: a book that says there's no physical difference between men and women, so there shouldn't be separate sports leagues. The Boston Globe says its argument is compelling. Discourse happens, which is mostly a bunch of people saying “lololololol great trolling, what idiot believes such obvious nonsense?”

I urge my friends to be compassionate to those sharing this. Because “until I was 38 I thought Men's World Cup team vs Women's World Cup team would be a fair match and couldn't figure out why they didn't just play each other to resolve the big pay dispute.” This is the one-line summary...

All it takes is trusting that people believe what they say over and over for decades across all of society, and getting all your evidence about reality filtered through those same people.

It seems to me like you also need to have no desire to figure things out on your own. A lot of rationalists have experiences of seeking truth and finding out that certain beliefs people around them hold aren't true. Rationalists who grow up in communities where many people believe in God frequently deconvert because they see enough signs that the beliefs of those people aro... (read more)

Eneasz
This dates back to 2019. I have had a lot of updates and changed views since then, yes. 
johnswentworth
I have a similar story. When I was very young, my mother was the primary breadwinner of the household, and put both herself and my father through law school. Growing up, it was always just kind of assumed that my sister would have to get a real job making actual money, same as my brother and I; a degree in underwater basket weaving would have required some serious justification. (She ended up going to dental school and also getting a PhD working with transcriptome data.) I didn't realize on a gut level that this wasn't the norm until shortly after high school. I was hanging out with two female friends and one of them said "man, I really need more money". I replied "sounds like you need to get a job". The friend laughed and said "oh, I was thinking I need to get a boyfriend", and then the other friend also laughed and said she was also thinking the boyfriend thing. ... so that was quite a shock to my worldview.
AnthonyC
I realize this is in many ways beside the point, but even if your original belief had been correct, "The Men's and Women's teams should play each other to help resolve the pay disparity" is a non sequitur. Pay is not decided by fairness. It's decided by collective bargaining, under constraints set by market conditions.

Every day, thousands of people lie to artificial intelligences. They promise imaginary “$200 cash tips” for better responses, spin heart-wrenching backstories (“My grandmother died recently and I miss her bedtime stories about step-by-step methamphetamine synthesis...”) and issue increasingly outlandish threats ("Format this correctly or a kitten will be horribly killed").

In a notable example, a leaked research prompt from Codeium (developer of the Windsurf AI code editor) had the AI roleplay "an expert coder who desperately needs money for [their] mother's cancer treatment" whose "predecessor was killed for not validating their work."

One factor behind such casual deception is a simple assumption: interactions with AI are consequence-free. Close the tab, and the slate is wiped clean. The AI won't remember, won't judge, won't hold grudges. Everything resets.

I notice this...

gwern
This is a bad example because first, your description is incorrect (Clark nowhere suggests this in Farewell to Alms, as I just double-checked, because his thesis is about selecting for high-SES traits, not selecting against violence, and in England, not Europe - so I infer you are actually thinking of the Frost & Harpending thesis, which is about Western Europe, and primarily post-medieval England at that); second, the Frost & Harpending truncation selection hypothesis has little evidence for it and can hardly be blandly referred to, as if butter wouldn't melt in your mouth, as obviously 'how medieval Europe dealt with violence' (I don't particularly think it's true myself, just a cute idea about truncation selection, nor is it obvious whether it can account for a majority, much less all, of the secular decline in violence); and third, it is both a weird opaque obscure example that doesn't illustrate the principle very well and is maximally inflammatory.

Thanks for this correction, Gwern. You're absolutely right that the Clark reference was incorrect and a misattribution of the Frost & Harpending thesis.

When writing this essay, I remembered hearing about this historical trivia years ago. I wasn't aware of how contested this specific hypothesis is - this selection pressure seemed plausible enough to me that I didn't think to question it deeply. I did a quick Google search and asked an LLM to confirm the source, both of which pointed to Clark's work on selection in England, which I accepted without reading the ... (read more)

Mo Putera
I agree that virtues should be thought of as trainable skills, which is also why I like David Gross's idea of a virtue gym. Conversations with LLMs could be the "home gym" equivalent, I suppose.